Achieving Privacy in the Adversarial Multi-Armed Bandit
Authors
Abstract
In this paper, we improve the previously best known regret bound for achieving ε-differential privacy in oblivious adversarial bandits from O(T^(2/3)/ε) to O(√(T ln T)/ε). This is achieved by combining a Laplace mechanism with EXP3. We show that though EXP3 is already differentially private, it leaks a linear amount of information in T. However, we can improve this privacy by relying on its intrinsic exponential mechanism for selecting actions. This allows us to reach O(√(ln T))-DP, with a regret of O(T^(2/3)) that holds against an adaptive adversary, an improvement over the best known of O(T^(3/4)). This is done by using an algorithm that runs EXP3 in a mini-batch loop. Finally, we run experiments that clearly demonstrate the validity of our theoretical analysis.
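The abstract's first construction (a Laplace mechanism combined with EXP3, adding calibrated noise to the importance-weighted loss estimates) can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: the parameter names (`eta`, `gamma`, `eps`) and the `losses` callback are illustrative choices, and noise is sampled as a difference of two exponentials, which is distributed Laplace(0, 1/ε).

```python
import math
import random

def dp_exp3(T, K, losses, eta=0.1, gamma=0.1, eps=1.0):
    """Sketch: EXP3 with Laplace noise on loss estimates.

    T       -- number of rounds
    K       -- number of arms
    losses  -- callback losses(t, arm) -> loss in [0, 1]
    eta     -- learning rate (illustrative value)
    gamma   -- uniform-exploration mixing rate (illustrative value)
    eps     -- differential-privacy parameter; noise scale is 1/eps
    """
    weights = [1.0] * K
    total_loss = 0.0
    for t in range(T):
        s = sum(weights)
        # Exponential-weights distribution, mixed with uniform exploration
        probs = [(1 - gamma) * w / s + gamma / K for w in weights]
        arm = random.choices(range(K), weights=probs)[0]
        loss = losses(t, arm)           # bandit feedback: only the chosen arm's loss
        total_loss += loss
        est = loss / probs[arm]         # importance-weighted loss estimate
        # Laplace(0, 1/eps) noise: difference of two Exp(eps) variables
        noise = random.expovariate(eps) - random.expovariate(eps)
        weights[arm] *= math.exp(-eta * (est + noise))
        # Renormalize to keep weights numerically stable
        m = max(weights)
        weights = [w / m for w in weights]
    return total_loss
```

The mini-batch variant mentioned for the adaptive-adversary result would wrap this update so that EXP3 only observes aggregated losses once per batch; that loop is omitted here.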
Similar resources
Online Linear Optimization through the Differential Privacy Lens
We develop a simple and powerful analysis technique for perturbation-style online learning algorithms, based on privacy-preserving randomization, that exhibits a suite of novel results. In particular, this work highlights the valuable addition of differential privacy methods to the toolkit used to design and understand online linear optimization tasks. This work describes the minimax optimal algo...
Stochastic and Adversarial Combinatorial Bandits
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting, we first derive problem-specific regret lower bounds, and analyze how these bounds scale with the dimension of the decision space. We then propose COMBUCB, algorithms that efficiently exploit the combinatorial structure of the problem, and derive finite-time upper bounds on thei...
Mistake Bounds on Noise-Free Multi-Armed Bandit Game
We study the {0, 1}-loss version of adaptive adversarial multi-armed bandit problems with α (≥ 1) lossless arms. For this problem, we show a tight bound of K − α − Θ(1/T) on the minimax expected number of mistakes (1-losses), where K is the number of arms and T is the number of rounds.
Noise Free Multi-armed Bandit Game
We study the loss version of adversarial multi-armed bandit problems with one lossless arm. We show an adversary's strategy that forces any player to suffer K − 1 − O(1/T) loss, where K is the number of arms and T is the number of rounds.
A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits (extended version)
We study the K-armed dueling bandit problem which is a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms. We propose an efficient algorithm called Relative Exponential-weight algorithm for Exploration and Exploitation (REX3) to handle the adversarial utility-based formulation of this problem. We prov...
Publication date: 2017